Skip to content

yaml2obj: Implement Coverage mapping format #129472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Conversation

chapuni
Copy link
Contributor

@chapuni chapuni commented Mar 3, 2025

yaml2obj can understand and encode CovMap sections without any options.

For now, this is dedicated to ELF. Affected sections are as below:

  • __llvm_covmap
  • __llvm_covfun
  • __llvm_prf_names

obj2yaml will follow with an additional option --covmap.

There are a few namespaces. Other implementations are sunk into anonymous namespace.

  • llvm::coverage::yaml contains YAML definitions and encoder/decoder interfaces of CovMap data. This can be split out of ObjectYAML and embedded into other component like Coverage.
  • llvm::covmap defines interfaces to ObjectYAML.

@chapuni chapuni requested review from MaskRay, ornata and evodius96 March 3, 2025 04:03
@llvmbot llvmbot added bazel "Peripheral" support tier build system: utils/bazel objectyaml labels Mar 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-objectyaml

Author: NAKAMURA Takumi (chapuni)

Changes

yaml2obj can understand and encode CovMap sections without any options.

For now, this is dedicated to ELF. Affected sections are as below:

  • __llvm_covmap
  • __llvm_covfun
  • __llvm_prf_names

obj2yaml will follow with an additional option --covmap.

There are a few namespaces. Other implementations are sunk into anonymous namespace.

  • llvm::coverage::yaml contains YAML definitions and encoder/decoder interfaces of CovMap data. This can be split out of ObjectYAML and embedded into other component like Coverage.
  • llvm::covmap defines interfaces to ObjectYAML.

Patch is 40.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/129472.diff

9 Files Affected:

  • (added) llvm/include/llvm/ObjectYAML/CovMap.h (+267)
  • (modified) llvm/include/llvm/ObjectYAML/ELFYAML.h (+14)
  • (modified) llvm/lib/ObjectYAML/CMakeLists.txt (+3)
  • (added) llvm/lib/ObjectYAML/CovMap.cpp (+485)
  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+38)
  • (modified) llvm/lib/ObjectYAML/ELFYAML.cpp (+13-1)
  • (added) llvm/test/tools/obj2yaml/ELF/covmap-be.yaml (+81)
  • (added) llvm/test/tools/obj2yaml/ELF/covmap.yaml (+151)
  • (modified) utils/bazel/llvm-project-overlay/llvm/BUILD.bazel (+2)
diff --git a/llvm/include/llvm/ObjectYAML/CovMap.h b/llvm/include/llvm/ObjectYAML/CovMap.h
new file mode 100644
index 0000000000000..3a0b86435d490
--- /dev/null
+++ b/llvm/include/llvm/ObjectYAML/CovMap.h
@@ -0,0 +1,267 @@
+//===- CovMap.h - ObjectYAML Interface for coverage map ---------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// - llvm::coverage::yaml
+//
+//   Describes binary file formats and YAML structures of coverage map.
+//
+// - llvm::yaml
+//
+//   Attachments for YAMLTraits.
+//
+// - llvm::covmap
+//
+//   Provides YAML encoder for coverage map.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_OBJECTYAML_COVMAP_H
+#define LLVM_OBJECTYAML_COVMAP_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ObjectYAML/ELFYAML.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/YAMLTraits.h"
+#include <array>
+#include <cstdint>
+#include <memory>
+#include <optional>
+#include <string>
+#include <vector>
+
+namespace llvm {
+class raw_ostream;
+} // namespace llvm
+
+namespace llvm::coverage::yaml {
+
+/// Base Counter, corresponding to coverage::Counter.
+struct CounterTy {
+  enum TagTy : uint8_t {
+    Zero = 0,
+    Ref,
+    Sub,
+    Add,
+  };
+
+  /// Optional in detailed view, since most Tag can be determined from
+  /// other optional fields.
+  std::optional<TagTy> Tag;
+
+  /// Internal use.
+  std::optional<uint64_t> Val;
+
+  std::optional<uint64_t> RefOpt;
+  std::optional<uint64_t> SubOpt;
+  std::optional<uint64_t> AddOpt;
+
+  virtual ~CounterTy() {}
+
+  virtual void mapping(llvm::yaml::IO &IO);
+
+  void encode(raw_ostream &OS) const;
+};
+
+/// Holds a pair of both hands but doesn't hold ops(add or sub).
+/// Ops is stored in CounterTy::Tag.
+using ExpressionTy = std::array<CounterTy, 2>;
+
+/// {True, False}
+using BranchTy = std::array<CounterTy, 2>;
+
+/// {ID, TrueID, FalseID}
+/// Note: This has +1 offset unlike mcdc::ConditionID.
+using MCDCBranchTy = std::array<uint16_t, 3>;
+
+struct DecisionTy {
+  uint64_t BIdx; ///< Bitmap index
+  uint64_t NC;   ///< NumConds
+
+  void mapping(llvm::yaml::IO &IO);
+
+  void encode(raw_ostream &OS) const;
+};
+
+/// {LineStart, ColumnStart, LineEnd, ColumnEnd}
+using LocTy = std::array<uint64_t, 4>;
+
+///
+struct RecTy : CounterTy {
+  enum ExtTagTy : uint8_t {
+    Skip = 2,
+    Branch = 4,
+    Decision = 5,
+    MCDCBranch = 6,
+  };
+
+  /// This is optional in detailed view.
+  std::optional<ExtTagTy> ExtTag;
+
+  // Options for extensions.
+  std::optional<uint64_t> Expansion; ///< Doesn't have ExtTag.
+  std::optional<BranchTy> BranchOpt; ///< Optionally has MCDC.
+  std::optional<MCDCBranchTy> MCDC;
+  std::optional<DecisionTy> DecisionOpt;
+
+  /// True or None.
+  /// Stored in ColumnEnd:31.
+  std::optional<bool> isGap;
+
+  std::optional<LocTy> Loc;  ///< Absolute line numbers.
+  std::optional<LocTy> dLoc; ///< Differential line numbers.
+
+  void mapping(llvm::yaml::IO &IO) override;
+
+  void encode(uint64_t &StartLoc, raw_ostream &OS) const;
+};
+
+/// {NumRecs, Recs...}
+struct FileRecsTy {
+  std::optional<unsigned> Index;       ///< Shown in detailed view.
+  std::optional<std::string> Filename; ///< Resolved by FileIDs.
+  std::vector<RecTy> Recs;
+
+  void mapping(llvm::yaml::IO &IO);
+};
+
+/// An element of CovFun array.
+struct CovFunTy {
+  std::optional<llvm::yaml::Hex64> NameRef;     ///< Hash value of the symbol.
+  std::optional<std::string> FuncName;          ///< Resolved by symtab.
+  llvm::yaml::Hex64 FuncHash;                   ///< Signature of this function.
+  llvm::yaml::Hex64 FilenamesRef;               ///< Pointer to CovMap
+  std::optional<std::vector<unsigned>> FileIDs; ///< Resolved by CovMap
+  std::vector<ExpressionTy> Expressions;
+  std::vector<FileRecsTy> Files; ///< 2-dimension array of Recs.
+
+  void mapping(llvm::yaml::IO &IO);
+
+  void encode(raw_ostream &OS, endianness Endianness) const;
+};
+
+/// An element of CovMap array.
+struct CovMapTy {
+  /// This is the key of CovMap but not present in the file
+  /// format. Calculate and store with Filenames.
+  llvm::yaml::Hex64 FilenamesRef;
+
+  std::optional<uint32_t> Version;
+
+  /// Raw Filenames (and storage of Files)
+  std::optional<std::vector<std::string>> Filenames;
+
+  /// Since Version5: Filenames[0] is the working directory (or
+  /// zero-length string). Note that indices in CovFun::FileIDs is
+  /// base on Filenames. (Then, +0, as WD, is not expected to appear)
+  std::optional<std::string> WD;
+  /// This may be ArrayRef in Decoder since Filenames has been
+  /// filled. On the other hand in Encoder, this should be a vector
+  /// since YAML parser doesn't endorse references.
+  std::optional<std::vector<std::string>> Files;
+
+  void mapping(llvm::yaml::IO &IO);
+
+  bool useWD() const { return (!Version || *Version >= 4); }
+  StringRef getWD() const { return (WD ? *WD : StringRef()); }
+
+  /// Generate Accumulated list with WD.
+  /// Returns a single element {WD} if AccFiles is not given.
+  std::vector<std::string>
+  generateAccFilenames(const std::optional<ArrayRef<StringRef>> &AccFilesOpt =
+                           std::nullopt) const;
+  /// Regenerate Filenames with WD.
+  /// Use Files if it is not None. Or given AccFiles is used.
+  void
+  regenerateFilenames(const std::optional<ArrayRef<StringRef>> &AccFilesOpt);
+
+  /// Encode Filenames. This is mostly used just to obtain FilenamesRef.
+  std::pair<uint64_t, std::string> encodeFilenames(
+      const std::optional<ArrayRef<StringRef>> &AccFilesOpt = std::nullopt,
+      bool Compress = false) const;
+
+  void encode(raw_ostream &OS, endianness Endianness) const;
+};
+
+} // namespace llvm::coverage::yaml
+
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::CovMapTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::CovFunTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::ExpressionTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::RecTy)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::coverage::yaml::FileRecsTy)
+LLVM_YAML_IS_FLOW_SEQUENCE_VECTOR(llvm::coverage::yaml::CounterTy)
+
+#define LLVM_COVERAGE_YAML_ELEM_MAPPING(Ty)                                    \
+  namespace llvm::yaml {                                                       \
+  template <> struct MappingTraits<llvm::coverage::yaml::Ty> {                 \
+    static void mapping(IO &IO, llvm::coverage::yaml::Ty &Obj) {               \
+      Obj.mapping(IO);                                                         \
+    }                                                                          \
+  };                                                                           \
+  }
+
+/// `Flow` is used for emission of a compact oneliner for RecTy.
+#define LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(Ty)                               \
+  namespace llvm::yaml {                                                       \
+  template <> struct MappingTraits<llvm::coverage::yaml::Ty> {                 \
+    static void mapping(IO &IO, llvm::coverage::yaml::Ty &Obj) {               \
+      Obj.mapping(IO);                                                         \
+      (void)flow;                                                              \
+    }                                                                          \
+    static const bool flow = true;                                             \
+  };                                                                           \
+  }
+
+#define LLVM_COVERAGE_YAML_ENUM(Ty)                                            \
+  namespace llvm::yaml {                                                       \
+  template <> struct ScalarEnumerationTraits<llvm::coverage::yaml::Ty> {       \
+    static void enumeration(IO &IO, llvm::coverage::yaml::Ty &Value);          \
+  };                                                                           \
+  }
+
+LLVM_COVERAGE_YAML_ENUM(CounterTy::TagTy)
+LLVM_COVERAGE_YAML_ENUM(RecTy::ExtTagTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(CounterTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(DecisionTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING_FLOW(RecTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(FileRecsTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(CovFunTy)
+LLVM_COVERAGE_YAML_ELEM_MAPPING(CovMapTy)
+
+namespace llvm::covmap {
+
+class Encoder {
+protected:
+  endianness Endianness;
+
+public:
+  Encoder(endianness Endianness) : Endianness(Endianness) {}
+  virtual ~Encoder() {}
+
+  /// Returns EncoderImpl.
+  static std::unique_ptr<Encoder> get(endianness Endianness);
+
+  /// Called from the Sections loop.
+  virtual void collect(ELFYAML::Chunk *Chunk) = 0;
+
+  /// Resolves names from CovFuns in advance of Emitter. It'd be too
+  /// late to resolve sections in Emitter since they are immutable
+  /// then.
+  virtual void fixup() = 0;
+};
+
+/// Returns whether Name is interested.
+bool nameMatches(StringRef Name);
+
+/// Returns a new ELFYAML Object.
+std::unique_ptr<ELFYAML::CovMapSectionBase> make_unique(StringRef Name);
+
+} // namespace llvm::covmap
+
+#endif // LLVM_OBJECTYAML_COVMAP_H
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index dfdfa055d65fa..e9bb7621b20d9 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -229,6 +229,7 @@ struct Chunk {
     DependentLibraries,
     CallGraphProfile,
     BBAddrMap,
+    CovMapBase,
 
     // Special chunks.
     SpecialChunksStart,
@@ -398,6 +399,19 @@ struct RawContentSection : Section {
   std::optional<std::vector<uint8_t>> ContentBuf;
 };
 
+struct CovMapSectionBase : Section {
+  std::optional<llvm::yaml::Hex64> Info;
+
+  CovMapSectionBase() : Section(ChunkKind::CovMapBase) {}
+
+  virtual void mapping(yaml::IO &IO) = 0;
+  virtual Error encode(raw_ostream &OS, endianness Endianness) const = 0;
+
+  static bool classof(const Chunk *S) {
+    return S->Kind == ChunkKind::CovMapBase;
+  }
+};
+
 struct NoBitsSection : Section {
   NoBitsSection() : Section(ChunkKind::NoBits) {}
 
diff --git a/llvm/lib/ObjectYAML/CMakeLists.txt b/llvm/lib/ObjectYAML/CMakeLists.txt
index b36974d47d9f8..11054a1e91388 100644
--- a/llvm/lib/ObjectYAML/CMakeLists.txt
+++ b/llvm/lib/ObjectYAML/CMakeLists.txt
@@ -7,6 +7,7 @@ add_llvm_component_library(LLVMObjectYAML
   CodeViewYAMLTypes.cpp
   COFFEmitter.cpp
   COFFYAML.cpp
+  CovMap.cpp
   DWARFEmitter.cpp
   DWARFYAML.cpp
   DXContainerEmitter.cpp
@@ -34,7 +35,9 @@ add_llvm_component_library(LLVMObjectYAML
 
   LINK_COMPONENTS
   BinaryFormat
+  Coverage
   Object
+  ProfileData
   Support
   TargetParser
   DebugInfoCodeView
diff --git a/llvm/lib/ObjectYAML/CovMap.cpp b/llvm/lib/ObjectYAML/CovMap.cpp
new file mode 100644
index 0000000000000..7662284caee76
--- /dev/null
+++ b/llvm/lib/ObjectYAML/CovMap.cpp
@@ -0,0 +1,485 @@
+//===- CovMap.cpp - ObjectYAML Interface for coverage map -----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Implementations of CovMap and encoder.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ObjectYAML/CovMap.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ObjectYAML/ELFYAML.h"
+#include "llvm/ProfileData/Coverage/CoverageMappingWriter.h"
+#include "llvm/ProfileData/InstrProf.h"
+#include "llvm/Support/Alignment.h"
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/LEB128.h"
+#include "llvm/Support/MD5.h"
+#include "llvm/Support/YAMLTraits.h"
+#include "llvm/Support/raw_ostream.h"
+#include <algorithm>
+#include <cstdint>
+#include <memory>
+#include <optional>
+#include <string>
+#include <utility>
+#include <vector>
+
+#define COVMAP_V3
+
+using namespace llvm;
+using namespace llvm::coverage::yaml;
+using namespace llvm::covmap;
+
+void CounterTy::encode(raw_ostream &OS) const {
+  std::pair<unsigned, uint64_t> C;
+  if (RefOpt)
+    C = {Ref, *RefOpt};
+  else if (SubOpt)
+    C = {Sub, *SubOpt};
+  else if (AddOpt)
+    C = {Add, *AddOpt};
+  else if (Tag && *Tag == Zero)
+    C = {Zero, 0u};
+  else if (Tag && Val)
+    C = {*Tag, *Val};
+  else
+    llvm_unreachable("Null value cannot be met");
+
+  encodeULEB128(C.first | (C.second << 2), OS);
+}
+
+void DecisionTy::encode(raw_ostream &OS) const {
+  encodeULEB128(BIdx, OS);
+  encodeULEB128(NC, OS);
+}
+
+void RecTy::encode(uint64_t &StartLoc, raw_ostream &OS) const {
+  if (Expansion) {
+    encodeULEB128(4 + (*Expansion << 3), OS);
+  } else if (ExtTag && *ExtTag == Skip) {
+    encodeULEB128(2 << 3, OS);
+  } else if (DecisionOpt) {
+    assert(!ExtTag || *ExtTag == Decision);
+    encodeULEB128(5 << 3, OS);
+    DecisionOpt->encode(OS);
+  } else if (MCDC) {
+    assert(!ExtTag || *ExtTag == MCDCBranch);
+    assert(BranchOpt);
+    encodeULEB128(6 << 3, OS);
+    (*BranchOpt)[0].encode(OS);
+    (*BranchOpt)[1].encode(OS);
+    encodeULEB128((*MCDC)[0], OS);
+    encodeULEB128((*MCDC)[1], OS);
+    encodeULEB128((*MCDC)[2], OS);
+  } else if (BranchOpt) {
+    assert(!ExtTag || *ExtTag == Branch);
+    encodeULEB128(4 << 3, OS);
+    (*BranchOpt)[0].encode(OS);
+    (*BranchOpt)[1].encode(OS);
+  } else {
+    // Non-tag CounterTy
+    CounterTy::encode(OS);
+  }
+
+  assert((!isGap || *isGap) && "Don't set isGap=false");
+  uint32_t Gap = (isGap ? (1u << 31) : 0u);
+  if (Loc) {
+    encodeULEB128((*Loc)[0] - StartLoc, OS);
+    encodeULEB128((*Loc)[1], OS);
+    encodeULEB128((*Loc)[2] - (*Loc)[0], OS);
+    encodeULEB128((*Loc)[3] | Gap, OS);
+    StartLoc = (*Loc)[0];
+  } else {
+    encodeULEB128((*dLoc)[0], OS);
+    encodeULEB128((*dLoc)[1], OS);
+    encodeULEB128((*dLoc)[2], OS);
+    encodeULEB128((*dLoc)[3] | Gap, OS);
+  }
+}
+
+void CovFunTy::encode(raw_ostream &OS, endianness Endianness) const {
+  // Encode Body in advance since DataSize should be known.
+  std::string Body;
+  raw_string_ostream SS(Body);
+
+  assert(FileIDs);
+  encodeULEB128(FileIDs->size(), SS);
+  for (auto I : *FileIDs)
+    encodeULEB128(I, SS);
+
+  encodeULEB128(Expressions.size(), SS);
+  for (const auto &[LHS, RHS] : Expressions) {
+    LHS.encode(SS);
+    RHS.encode(SS);
+  }
+
+  for (const auto &File : Files) {
+    encodeULEB128(File.Recs.size(), SS);
+    uint64_t StartLoc = 0;
+    for (const auto &Rec : File.Recs)
+      Rec.encode(StartLoc, SS);
+  }
+
+  // Emit the Header
+  uint64_t NameRef = (this->NameRef ? static_cast<uint64_t>(*this->NameRef)
+                                    : MD5Hash(*this->FuncName));
+  uint32_t DataSize = Body.size();
+  /* this->FuncHash */
+  char CoverageMapping = 0; // dummy
+  /* this->FilenamesRef */
+
+#define COVMAP_FUNC_RECORD(Type, LLVMType, Name, Initializer)                  \
+  if (sizeof(Name) > 1) {                                                      \
+    Type t = support::endian::byte_swap(Name, Endianness);                     \
+    OS << StringRef(reinterpret_cast<const char *>(&t), sizeof(t));            \
+  }
+#include "llvm/ProfileData/InstrProfData.inc"
+
+  // Emit the body.
+  OS << std::move(Body);
+}
+
+std::vector<std::string> CovMapTy::generateAccFilenames(
+    const std::optional<ArrayRef<StringRef>> &AccFilesOpt) const {
+  std::vector<std::string> Result;
+  if (useWD())
+    Result.push_back(getWD().str());
+  // Returns {WD} if AccFiles is None.
+  if (AccFilesOpt) {
+    for (auto &Filename : *AccFilesOpt)
+      Result.push_back(Filename.str());
+  }
+  return Result;
+}
+
+void CovMapTy::regenerateFilenames(
+    const std::optional<ArrayRef<StringRef>> &AccFilesOpt) {
+  assert(!this->Filenames);
+  if (this->Files) {
+    auto &CovMapFilenames = this->Filenames.emplace(generateAccFilenames());
+    assert(CovMapFilenames.size() <= 1);
+    for (auto &&File : *this->Files)
+      CovMapFilenames.push_back(std::move(File));
+  } else {
+    // Encode Accfiles, that comes from CovFun.
+    this->Filenames = generateAccFilenames(AccFilesOpt);
+  }
+}
+
+std::pair<uint64_t, std::string>
+CovMapTy::encodeFilenames(const std::optional<ArrayRef<StringRef>> &AccFilesOpt,
+                          bool Compress) const {
+  ArrayRef<std::string> TempFilenames;
+  std::vector<std::string> AccFilenames; // Storage
+
+  if (AccFilesOpt) {
+    AccFilenames = generateAccFilenames(AccFilesOpt);
+    TempFilenames = AccFilenames;
+  } else {
+    assert(this->Filenames);
+    TempFilenames = ArrayRef(*this->Filenames);
+  }
+
+  std::string FilenamesBlob;
+  llvm::raw_string_ostream OS(FilenamesBlob);
+  CoverageFilenamesSectionWriter(TempFilenames).write(OS, Compress);
+
+  return {llvm::IndexedInstrProf::ComputeHash(FilenamesBlob), FilenamesBlob};
+}
+
+void CovMapTy::encode(raw_ostream &OS, endianness Endianness) const {
+  auto [FilenamesRef, FilenamesBlob] = encodeFilenames();
+
+  uint32_t NRecords = 0;
+  uint32_t FilenamesSize = FilenamesBlob.size();
+  uint32_t CoverageSize = 0;
+  uint32_t Version =
+      (this->Version ? *this->Version : INSTR_PROF_COVMAP_VERSION);
+  struct {
+#define COVMAP_HEADER(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+  } CovMapHeader = {
+#define COVMAP_HEADER(Type, LLVMType, Name, Initializer)                       \
+  support::endian::byte_swap(Name, Endianness),
+#include "llvm/ProfileData/InstrProfData.inc"
+  };
+  StringRef HeaderBytes(reinterpret_cast<char *>(&CovMapHeader),
+                        sizeof(CovMapHeader));
+  OS << HeaderBytes;
+
+  // llvm_covmap's alignment
+  FilenamesBlob.resize(llvm::alignTo(FilenamesBlob.size(), sizeof(uint32_t)));
+  OS << FilenamesBlob;
+}
+
+void CounterTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapOptional("Tag", Tag);
+  IO.mapOptional("Val", Val);
+  IO.mapOptional("Ref", RefOpt);
+  IO.mapOptional("Sub", SubOpt);
+  IO.mapOptional("Add", AddOpt);
+}
+
+void DecisionTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapRequired("BIdx", BIdx);
+  IO.mapRequired("NCond", NC);
+}
+
+void RecTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapOptional("Loc", Loc);
+  IO.mapOptional("dLoc", dLoc);
+  IO.mapOptional("isGap", isGap);
+  CounterTy::mapping(IO);
+  IO.mapOptional("ExtTag", ExtTag);
+  IO.mapOptional("Expansion", Expansion);
+  IO.mapOptional("Branch", BranchOpt);
+  IO.mapOptional("MCDC", MCDC);
+  IO.mapOptional("Decision", DecisionOpt);
+}
+
+void FileRecsTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapOptional("Index", Index);
+  IO.mapOptional("Filename", Filename);
+  IO.mapRequired("Regions", Recs);
+}
+
+void CovFunTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapOptional("NameRef", NameRef);
+  IO.mapOptional("FuncName", FuncName);
+  IO.mapRequired("FuncHash", FuncHash);
+  IO.mapRequired("FilenamesRef", FilenamesRef);
+  IO.mapOptional("FileIDs", FileIDs);
+  IO.mapRequired("Expressions", Expressions);
+  IO.mapRequired("Files", Files);
+}
+
+void CovMapTy::mapping(llvm::yaml::IO &IO) {
+  IO.mapRequired("FilenamesRef", FilenamesRef);
+  IO.mapOptional("Version", Version);
+  IO.mapOptional("Filenames", Filenames);
+  IO.mapOptional("WD", WD);
+  IO.mapOptional("Files", Files);
+}
+
+#define ECase(N, X) IO.enumCase(Value, #X, N::X)
+
+void llvm::yaml::ScalarEnumerationTraits<CounterTy::TagTy>::enumeration(
+    llvm::yaml::IO &IO, CounterTy::TagTy &Value) {
+  ECase(CounterTy, Zero);
+  ECase(CounterTy, Ref);
+  ECase(CounterTy, Sub);
+  ECase(CounterTy, Add);
+}
+
+void llvm::yaml::ScalarEnumerationTraits<RecTy::ExtTagTy>::enumeration(
+    llvm::yaml::IO &IO, RecTy::ExtTagTy &Value) {
+  ECase(RecTy, Skip);
+  ECase(RecTy, Branch);
+  ECase(RecTy, Decision);
+  ECase(RecTy, MCDCBranch);
+}
+
+namespace {
+
+struct PrfNamesSection : ELFYAML::CovMapSectionBase {
+  using PrfNamesTy = SmallVector<std::string>;
+  SmallVector<PrfNamesTy, 1> PrfNames;
+
+  PrfNamesSection() { Name = "__llvm_prf_names"; }
+  static bool nameMatches(StringRef Name) { return Name == "__llvm_prf_names"; }
+  static bool class...
[truncated]

@chapuni chapuni requested review from jh7370 and removed request for keith, rupprecht and aaronmondal March 3, 2025 04:29
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point on one of the related PRs still stands: this is too much code at once for a person to be able to reliably review. You don't need to have yaml2obj fully functional in the first PR. Instead, add support piece by piece, the smaller the pieces, the better, as then it becomes much easier to review each bit independently.

Skimming this patch and one obvious thing you could do is to get rid of all the optional fields in the first instance. There are probably multiple other things you can do to improve things to. In the end, I'd expect a whole series of linked PRs, built on top of each other. One of the early ones would introduce a test that shows a minimal support that may well not comply to the section specification yet (perhaps it simply recognises the kind of section). The test could then be extended as each piece is added, which also helps ensure we have full test coverage.

@jh7370
Copy link
Collaborator

jh7370 commented Mar 7, 2025

If this is ready for a round of review, please let me know. Please note that I have scheduled time off between the 14th and 28th of this month, so won't be able to review during this time. @MaskRay may be a good alternative reviewer of the yaml2obj stuff, if he's available.


/// Base Counter, corresponding to coverage::Counter.
struct CounterTy {
enum TagTy : uint8_t {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the name "TagTy"?

this refers to the type of operation, correct?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to call this "OpTy" or something?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enum TagTy : uint8_t {
enum OperationTy : uint8_t {

};

TagTy Tag;
uint64_t Val;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Val is the value of the counter? Can you document that?


virtual void mapping(llvm::yaml::IO &IO);

void encode(raw_ostream &OS) const;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a doxygen comment for this? e.g.

Encodes \p OS using ULEB128.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, would it possibly be useful to return the unsigned from ULEB128?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
void encode(raw_ostream &OS) const;
/// Encodes \p OS using ULEB128.
void encode(raw_ostream &OS) const;

/// Note: This has +1 offset unlike mcdc::ConditionID.
using MCDCBranchTy = std::array<uint16_t, 3>;

struct DecisionTy {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write a doxygen comment for this struct?

/// {LineStart, ColumnStart, LineEnd, ColumnEnd}
using LocTy = std::array<uint64_t, 4>;

///
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you fill this out?

using ExpressionTy = std::array<CounterTy, 2>;

/// {True, False}
using BranchTy = std::array<CounterTy, 2>;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be OK to use a struct here?

struct BranchTy {
  CounterTy True;
  CounterTy False;
};

Then later, when you're encoding, it would be easer to read, because you could encode True and False directly rather than [0] and [1].

If it is considerably more performant to use std::array, then it's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make yaml text compact.

} else if (ExtTag && *ExtTag == Skip) {
encodeULEB128(2 << 3, OS);
} else if (DecisionOpt) {
assert(!ExtTag || *ExtTag == Decision);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would !ExtTag be true here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If DecisionOpt exists, ExtTag should be Decision or shouldn't exist.

encodeULEB128(5 << 3, OS);
DecisionOpt->encode(OS);
} else if (MCDC) {
assert(!ExtTag || *ExtTag == MCDCBranch);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would !ExtTag be true here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of MCDC shall identify this record as MCDCBranch.

encodeULEB128((*MCDC)[1], OS);
encodeULEB128((*MCDC)[2], OS);
} else if (BranchOpt) {
assert(!ExtTag || *ExtTag == Branch);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would !ExtTag be true here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!MCDC && BranchOpt shall identify this record as Branch.

void encode(uint64_t &StartLoc, raw_ostream &OS) const;
};

/// {NumRecs, Recs...}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumRecs not present?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you add a comment that describes what this does in the context of the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to mention here, NumRecs is in encoded file. Is this noisy?

@chapuni
Copy link
Contributor Author

chapuni commented Mar 14, 2025

@MaskRay I suppose you are an appropriate reviewer since you have experiences in both covmap and objyaml. Could you take a look at the series of patches?

Patches are ready for review.

@chapuni chapuni added coverage and removed bazel "Peripheral" support tier build system: utils/bazel labels Mar 15, 2025
Copy link
Contributor Author

@chapuni chapuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically and generally, these definitions follow and assume knowledge of the coverage mapping file format and the internal implementation.

If there would be diversion between the file format and the internal implementation, we will follow the file format.

using ExpressionTy = std::array<CounterTy, 2>;

/// {True, False}
using BranchTy = std::array<CounterTy, 2>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make yaml text compact.

// Options for extensions.
std::optional<uint64_t> Expansion; ///< Doesn't have ExtTag.
std::optional<BranchTy> BranchOpt; ///< Optionally has MCDC.
std::optional<MCDCBranchTy> MCDC;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be used optionally.

void encode(uint64_t &StartLoc, raw_ostream &OS) const;
};

/// {NumRecs, Recs...}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to mention here, NumRecs is in encoded file. Is this noisy?

encodeULEB128(NC, OS);
}

void RecTy::encode(uint64_t &StartLoc, raw_ostream &OS) const {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused here. Let me try hiding.

} else if (ExtTag && *ExtTag == Skip) {
encodeULEB128(2 << 3, OS);
} else if (DecisionOpt) {
assert(!ExtTag || *ExtTag == Decision);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If DecisionOpt exists, ExtTag should be Decision or shouldn't exist.

encodeULEB128(5 << 3, OS);
DecisionOpt->encode(OS);
} else if (MCDC) {
assert(!ExtTag || *ExtTag == MCDCBranch);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existence of MCDC shall identify this record as MCDCBranch.

encodeULEB128((*MCDC)[1], OS);
encodeULEB128((*MCDC)[2], OS);
} else if (BranchOpt) {
assert(!ExtTag || *ExtTag == Branch);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!MCDC && BranchOpt shall identify this record as Branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants